Applications of Approximate Word Matching in Information Retrieval

Authors

  • James C. French
  • Allison L. Powell
  • Eric Schulman
Abstract

As more online databases are integrated into digital libraries, the issue of quality control of the data becomes increasingly important, especially as it relates to the effective retrieval of information. The need to discover and reconcile variant forms of strings in bibliographic entries, i.e., authority work, will become more difficult. Spelling variants, misspellings, and transliteration differences will all increase the difficulty of retrieving information. Approximate string matching has traditionally been used to help with this problem. In this paper we introduce the notion of approximate word matching and show how it can be used to improve detection and categorization of variant forms.
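The abstract does not spell out how approximate word matching is defined, so the following is only a minimal sketch of one plausible reading, not the authors' method. It first computes the classic character-level Levenshtein distance, then applies the same recurrence to sequences of words, treating two words as equal when they lie within a small number of character edits. The function names `edit_distance` and `word_distance` and the `max_char_edits` parameter are assumptions introduced for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1 each."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def word_distance(s: str, t: str, max_char_edits: int = 1) -> int:
    """Edit distance over word sequences (hypothetical formulation).

    Two words are treated as matching when their character-level
    edit distance is at most max_char_edits.
    """
    ws, wt = s.lower().split(), t.lower().split()
    prev = list(range(len(wt) + 1))
    for i, wa in enumerate(ws, start=1):
        curr = [i]
        for j, wb in enumerate(wt, start=1):
            cost = 0 if edit_distance(wa, wb) <= max_char_edits else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]
```

Under these assumptions, `word_distance("Astrophysical Journal", "Astrophysical Jurnal")` is 0 because the misspelled word is within one character edit, while `word_distance("Dept of Astronomy", "Department of Astronomy")` is 1 because the abbreviation is too far from the full word to count as a match. Separating the word-level and character-level thresholds in this way gives a rough means of telling misspellings apart from abbreviations or reworded entries.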


Similar resources

Adaptive Approximate Record Matching

Typographical data entry errors and incomplete documents produce imperfect records in real-world databases. These errors generate distinct records which belong to the same entity. The aim of Approximate Record Matching is to find multiple records which belong to an entity. In this paper, an algorithm for Approximate Record Matching is proposed that can be adapted automatically with input error...


A Parallel Algorithm for Fixed-Length Approximate String-Matching with k-mismatches

This paper deals with the approximate string-matching problem with Hamming distance. The approximate string-matching with k-mismatches problem is to find all locations at which a query of length m matches a factor of a text of length n with k or fewer mismatches. Approximate string-matching algorithms have both pleasing theoretical features and direct applications, especially in comp...


Fast Convolutions and Their Applications in Approximate String Matching

We develop a method for performing boolean convolutions efficiently in the word RAM model of computation, having a word size of w = Ω(log n) bits, where n is the input size. The technique is applied to approximate string matching under Hamming distance. The obtained algorithms are the fastest known. In particular, we reduce the complexity of the Amir et al. [1] algorithm for k-mismatches from O(n √...


The matching interdiction problem in dendrimers

The purpose of the matching interdiction problem in a weighted graph is to find two vertices such that the weight of the maximum matching in the graph without these vertices is minimized. An approximate solution for this problem has been presented. In this paper, we consider dendrimers as graphs such that the weights of edges are the bond lengths. We obtain the maximum matching in some types of...


Semantic processing survey of spoken and written words in adolescents with cerebral palsy: Evidence from PALPA word-picture matching test

Objective: The present study aimed to assess and compare semantic processing of spoken and written words in adolescents with cerebral palsy and healthy adolescents. Method: The present study is quantitative in terms of type and experimental in terms of method. The examination group consisted of 30 adolescents with cerebral palsy, aged 10 to 15 years, selected by convenience sampling. All of ...



Publication date: 1997